Variational optimization of neural-network representations of quantum states has been successfully applied to solve interacting fermionic problems. Despite rapid development, significant scalability challenges arise when considering molecules of large scale, which correspond to non-locally interacting quantum spin Hamiltonians consisting of sums of thousands or even millions of Pauli operators. In this work, we introduce scalable parallelization strategies to improve neural-network-based variational quantum Monte Carlo calculations for ab initio quantum chemistry applications. We establish GPU-supported local-energy parallelism to compute the optimization objective for Hamiltonians of potentially complex molecules. Using autoregressive sampling techniques, we demonstrate systematic improvement in the wall-clock time required to reach CCSD baseline target energies. Performance is further enhanced by adapting the structure of the resulting spin Hamiltonian to the autoregressive sampling order. The algorithm achieves promising performance in comparison with classical approximate methods, and exhibits both runtime and scalability advantages over existing neural-network-based methods.
translated by Google Translate
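The local-energy objective above can be illustrated with a minimal sketch: for a Hamiltonian written as a sum of Pauli strings, each term contributes its coefficient times a wavefunction amplitude ratio, and the per-term loop is exactly what lends itself to GPU parallelism. The toy wavefunction and two-term Hamiltonian below are hypothetical stand-ins, not the paper's model.

```python
import numpy as np

def local_energy(sample, log_psi, coeffs, flips):
    """E_loc(x) = sum_k c_k * psi(x_k') / psi(x), where x_k' is x with
    the spins listed in flips[k] inverted (an empty tuple means the
    Pauli string acts diagonally). log_psi maps a spin configuration
    to its log-amplitude."""
    e = 0.0
    for c, f in zip(coeffs, flips):
        xp = sample.copy()
        xp[list(f)] *= -1  # apply the off-diagonal (X-like) part of the string
        e += c * np.exp(log_psi(xp) - log_psi(sample))
    return e

# Toy product-state wavefunction: log psi(x) = a * sum_i x_i
log_psi = lambda x: 0.3 * x.sum()
x = np.array([1, -1, 1, 1])
# Hypothetical two-term Hamiltonian: 0.5 * I + 0.2 * X_0
e = local_energy(x, log_psi, [0.5, 0.2], [(), (0,)])
```

In practice the loop over terms (and over samples) is vectorized and batched on the GPU rather than executed serially as here.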
With the development of body-worn wearable sensing technology, human activity recognition has become an attractive research field. Thanks to comfortable electronic textiles, sensors can be embedded into clothing so that human motion can be recorded over long periods. However, a long-standing issue is how to deal with the motion artifacts introduced by the movement of clothing relative to the body. Surprisingly, recent empirical findings suggest that clothing-attached sensors can actually achieve higher activity recognition accuracy than rigidly attached sensors, especially when predicting from short time windows. In this work, a probabilistic model is introduced in which this improved accuracy is attributed to the increased statistical distance between movements as recorded via fabric sensing. The model's predictions are validated in simulated and real human motion-capture experiments, where this counterintuitive effect is closely captured.
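The "statistical distance" argument lends itself to a small illustration. A minimal sketch (not the paper's actual model): fit Gaussians to a rigidly attached signal and to a fabric-sensed version of it, and measure their KL divergence; the fabric channel's extra variability shows up as a strictly positive distance.

```python
import numpy as np

def gaussian_kl(mu0, var0, mu1, var1):
    """KL(N0 || N1) between two 1-D Gaussians."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

rng = np.random.default_rng(0)
body = rng.normal(0.0, 1.0, 2000)            # rigidly attached sensor
fabric = body + rng.normal(0.0, 0.5, 2000)   # fabric adds motion artifact
d = gaussian_kl(body.mean(), body.var(), fabric.mean(), fabric.var())
```

The synthetic "motion artifact" here is additive Gaussian noise, an assumption for illustration only.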
Agriculture is facing a labor crisis, leading to increased interest in small robots (agbots) that can perform precise, targeted actions (e.g., crop scouting, weeding, fertilization) while being supervised by a human operator. However, farmers are not necessarily experts in robotics and will not adopt technologies that increase their workload or do not provide an immediate payoff. In this work, we explore methods for communication between a remote human operator and multiple agbots, and investigate the impact of audio communication on the operator's preferences and productivity. We developed a simulation platform in which agbots are deployed in a field, randomly encounter failures, and call the operator for help. As the agbots report errors, various audio communication mechanisms are tested to convey which robot failed and what type of failure has occurred. The human is tasked with verbally diagnosing the failure while completing a secondary task. A user study was conducted to test three audio communication methods: earcons, single-phrase commands, and full-sentence communication. Each participant completed a survey to determine their preferences and the overall efficiency of each method. Our results indicate that the system using single phrases was perceived most positively by participants and allowed them to complete the secondary task more efficiently. The code is available at: https://github.com/akamboj2/agbot-sim.
Type-B Aortic Dissection (TBAD) is one of the most serious cardiovascular events, characterized by a growing yearly incidence and severe disease prognosis. Currently, computed tomography angiography (CTA) has been widely adopted for the diagnosis and prognosis of TBAD. Accurate segmentation of the true lumen (TL), false lumen (FL), and false lumen thrombus (FLT) in CTA is crucial for the precise quantification of anatomical features. However, existing works focus only on TL and FL without considering FLT. In this paper, we propose ImageTBAD, the first 3D computed tomography angiography (CTA) image dataset of TBAD with annotations of TL, FL, and FLT. The proposed dataset contains 100 TBAD CTA images, which is of decent size compared with existing medical imaging datasets. As FLT can appear almost anywhere along the aorta with irregular shapes, FLT segmentation presents a wide class of segmentation problems in which the target exists in various positions with irregular shapes. We further propose a baseline method for automatic segmentation of TBAD. Results show that the baseline method achieves results comparable with existing works on aorta and TL segmentation. However, the segmentation accuracy of FLT is only 52%, which leaves large room for improvement and shows the challenge of our dataset. To facilitate further research on this challenging problem, our dataset and code will be released to the public.
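Segmentation accuracy figures like the 52% quoted above are conventionally Dice coefficients (an assumption here; the abstract does not name the metric). A minimal sketch of the metric on binary masks:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks:
    2 * |pred AND target| / (|pred| + |target|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

p = np.array([1, 1, 0, 0])
t = np.array([1, 0, 1, 0])
d = dice(p, t)  # one overlapping voxel out of two per mask
```

For multi-class TL/FL/FLT segmentation, the metric would be computed per class and averaged.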
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
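As a rough illustration of saliency-aware pooling (the calibrated SSTAM formula is more involved and is not reproduced here), per-pixel artifact-strength maps can be pooled under a normalized saliency map and then combined across artifact types with per-type weights; the weights below are hypothetical.

```python
import numpy as np

def saliency_weighted_score(artifact_maps, saliency, weights):
    """Pool each per-pixel artifact-strength map under a normalized
    saliency map, then combine the per-artifact scores linearly.
    Illustrative only; the weights are placeholders, not SSTAM's."""
    s = saliency / saliency.sum()
    pooled = [float((m * s).sum()) for m in artifact_maps]
    return sum(w * p for w, p in zip(weights, pooled))

# Two toy artifact maps (e.g., blurring and blocking) on a 2x2 frame
maps = [np.full((2, 2), 2.0), np.full((2, 2), 4.0)]
score = saliency_weighted_score(maps, np.ones((2, 2)), [0.5, 0.5])
```

With a uniform saliency map this reduces to a weighted mean; a real saliency model concentrates the weight on perceptually important regions.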
As one of the most important psychic stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, recognizing MEs automatically (MER) is becoming increasingly crucial in the field of affective computing, and provides essential technical support for lie detection, psychological analysis, and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite recent efforts to alleviate this problem with several spontaneous ME datasets, the amount of available ME data remains tiny. To address ME data hunger, we construct a dynamic spontaneous ME dataset with the largest ME data scale to date, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos induced from 671 participants and annotated by more than 20 annotators over three years. Afterwards, we adopt four classical spatiotemporal feature learning models on DFME to perform MER experiments and objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class-imbalance and key-frame sequence-sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate research on automatic MER and provides a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.
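One standard remedy for the class imbalance mentioned above is inverse-frequency loss weighting; this is a generic technique offered for illustration, not necessarily the solution the DFME authors settled on.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class loss weights proportional to 1/count, rescaled so the
    weights average to 1 (rare classes get weight > 1)."""
    counts = Counter(labels)
    raw = {c: 1.0 / n for c, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}

# Toy imbalanced label set: 3 "happiness" samples, 1 "fear" sample
w = inverse_frequency_weights(["happiness"] * 3 + ["fear"])
```

The resulting weights would typically be passed to a weighted cross-entropy loss or a weighted sampler during training.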
Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (e.g., phone unlocking) while lacking consideration of long-distance scenes (e.g., surveillance security checks). In order to promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In such scenes, low image resolution and noise interference are new challenges for surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by incorporating a super-resolution network. (2) Generated sample pairs simulate the quality variance distribution, helping the contrastive learning strategy obtain robust feature representations under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, extensive experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.
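The contrastive strategy in point (2) can be sketched with a textbook InfoNCE loss over (original, quality-degraded) feature pairs scored against in-batch negatives; this is a generic formulation, not CQIL's exact objective.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one (high-quality, degraded) feature pair:
    cosine similarities at temperature tau, softmax over the positive
    plus the negatives. Lower is better alignment."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

# Toy features: a well-aligned pair and one orthogonal negative
loss = info_nce(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                [np.array([0.0, 1.0])])
```

A near-zero loss here reflects that the degraded view's feature matches its high-quality counterpart, which is the invariance the network is trained toward.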
The interview has been regarded as one of the most crucial steps in recruitment. To fully prepare for interviews with recruiters, job seekers usually practice with mock interviews between each other. However, such mock interviews with peers are generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are not likely to behave like a real interviewer. Due to the rapid growth of online recruitment in recent years, recruiters tend to hold online interviews, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs as well as resume data, which are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
Nowadays, time-stamped web documents related to general news queries flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time-series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper we propose a Unified Timeline Summarizer (UTS) that can generate both abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we extract the event-level attention in the generation process with its sequential information retained, and use it to simulate the evolutionary attention of the ground-truth summary. The event-level attention can also be used to assist extractive summarization, where the extracted summary likewise comes in time order. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
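The graph-based event encoder can be caricatured as one propagation step over a content-dependency adjacency matrix, where each event's representation absorbs those of its related events; this is a deliberate simplification, since UTS's encoder is learned rather than a fixed average.

```python
import numpy as np

def event_graph_encode(features, adj):
    """One propagation step of a toy graph encoder: each event's new
    representation is the average of itself and its content-dependent
    neighbors (adj[i, j] = 1 if events i and j are related)."""
    a = adj + np.eye(len(adj))            # add self-loops
    a = a / a.sum(axis=1, keepdims=True)  # row-normalize to an average
    return a @ features

# Two related events with 1-D toy features
out = event_graph_encode(np.array([[1.0], [3.0]]),
                         np.array([[0.0, 1.0], [1.0, 0.0]]))
```

Stacking such steps with learned weight matrices and nonlinearities would give the usual graph-network formulation.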
For Prognostics and Health Management (PHM) of Lithium-ion (Li-ion) batteries, many models have been established to characterize their degradation process. The existing empirical or physical models can reveal important information regarding the degradation dynamics. However, there are no general and flexible methods to fuse the information represented by those models. Physics-Informed Neural Networks (PINNs) are an efficient tool for fusing empirical or physical dynamic models with data-driven models. To take full advantage of various information sources, we propose a model fusion scheme based on PINN. It is implemented by developing a semi-empirical, semi-physical Partial Differential Equation (PDE) to model the degradation dynamics of Li-ion batteries. When there is little prior knowledge about the dynamics, we leverage the data-driven Deep Hidden Physics Model (DeepHPM) to discover the underlying governing dynamic models. The uncovered dynamics information is then fused with that mined by the surrogate neural network in the PINN framework. Moreover, an uncertainty-based adaptive weighting method is employed to balance the multiple learning tasks when training the PINN. The proposed methods are verified on a public dataset of Lithium Iron Phosphate (LFP)/graphite batteries.
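Uncertainty-based adaptive weighting of multiple learning tasks is commonly implemented as homoscedastic-uncertainty weighting, where each task loss L_i is scaled by exp(-s_i) for a learnable log-variance s_i, with s_i itself added as a regularizer; whether the paper uses this exact variant is an assumption here.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i, where
    s_i = log sigma_i^2 is a learnable per-task log-variance. In a
    real PINN, the s_i are optimized jointly with the network."""
    losses = np.asarray(losses, dtype=float)
    s = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-s) * losses + s))

# Two tasks (e.g., data-fit loss and PDE-residual loss) with s_i = 0
total = uncertainty_weighted_loss([1.0, 2.0], [0.0, 0.0])
```

Raising s_i down-weights a noisy or hard task's loss while the +s_i term keeps the trivial solution (infinite variance) unattractive.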